DETAILED ACTION

1. This office action is in response to correspondence filed 04/09/10 regarding
application 10/554010, in which no claims were amended. Claims 1-10 and 13 are still 
pending in the application and have been reconsidered. 

Response to Arguments 

2. The arguments on pages 4-9 of the Remarks have been considered but are not 
persuasive. 

3. On page 8, the Remarks assert "It is unknown to the Applicants how this 
disclosure of Gray (p1 Abstract) relates to the subject invention or how the Examiner is 
taking the mention of "central moving average filter" to be that same as "dividing... by 
an average of subsequent values of said extracted predetermined audio feature". 

In response, Scheirer teaches dividing the summed energy within the at least 
one predetermined frequency band by an average of values of said extracted 
predetermined audio feature because on Col 7 lines 65-67, Scheirer divides the four Hz 
modulation energy signal by the frame energy signal to get a normalized speech 
modulation energy value. However, Scheirer does not specifically mention an average 
of subsequent values. As was explained on page 4 of the office action of 01/11/10, Gray
discloses an average of subsequent values on p1, Abstract, where Gray discloses a
central moving average filter, which divides the present value by preceding and
subsequent values. For example, a 10 point central moving average filter would sum the
preceding 5 points and the subsequent 5 points of data and divide by 10. An artisan at
the time of the invention skilled in the art of digital signal processing would have been
familiar with the well known forward, central, and trailing average filters, as well as 
normalizing a data sequence by dividing the data points by an average of values. 
Furthermore, the energy signal for a frame of audio data is the average energy for the 
set of samples that make up the frame. It would have been obvious to one of ordinary 
skill in the art at the time of the invention to modify the invention of Scheirer to include 
an average of subsequent values by using a central moving average filter as the moving 
average filter, in order to determine the trend at the point of greatest precision, as 
suggested by Gray (p10). 
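
For illustration only, the 10 point example above can be sketched in Python. The function names, the sample data, and the handling of edge points are assumptions that appear in neither Gray nor Scheirer. Many textbook central moving averages also include the present point; this sketch omits it to match the example as worded.

```python
def central_moving_average(values, half_width=5):
    """Average of the half_width points before and after each index.

    Follows the 10 point example in the text: the present point itself is
    not part of the sum. Indices without a full window on both sides yield
    None (an assumption; edge handling is not discussed in the text).
    """
    n = len(values)
    out = [None] * n
    for i in range(half_width, n - half_width):
        before = values[i - half_width:i]
        after = values[i + 1:i + 1 + half_width]
        out[i] = (sum(before) + sum(after)) / (2 * half_width)
    return out


def normalize_by_average(values, averages):
    """Divide each point by its local average where one is defined and nonzero."""
    return [v / a if a else None for v, a in zip(values, averages)]


signal = [float(x) for x in range(1, 12)]  # hypothetical data: 1.0 .. 11.0
averages = central_moving_average(signal)
normalized = normalize_by_average(signal, averages)
```

Only the middle sample of this 11 point sequence has five neighbors on each side, so it is the only index that receives an average.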

The particular language of the claim requires only "an average of subsequent 
values" and does not rule out a central moving average filter because it does not rule 
out using preceding values in addition to subsequent values in calculating the average. 
While the claims are interpreted in light of the specification, limitations from the 
specification are not read into the claims. 

The remaining arguments on pages 8-9 assert that the other claims are allowable 
by virtue of dependence or for similar reasons to those of claim 1, and so are not
persuasive for the same reasons. 

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 1-3, 5, 6, 8, 9, and 13 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Scheirer et al. (6,570,991) in view of Gray et al. ("Design of Moving
Average Trend Filters using Fidelity, Smoothness and Minimum Revisions Criteria".
Statistical Research Report Series No. RR96/01, Institute of Statistics and Operations
Research, Victoria University of Wellington, New Zealand, 1997).

Consider claim 1, Scheirer discloses a method for classifying at least one audio
signal into at least one audio class (Title), the method comprising the steps of: 

analyzing said audio signal to extract at least one predetermined audio feature 
(Fig 1, feature detector 12); 

performing a frequency analysis on a set of values of said extracted 
predetermined audio feature at different time instances resulting in a power spectrum of 
said extracted predetermined audio feature (Fig 2, different frames, Fig 7c, power 
spectrum, Col 7 lines 53-54, calculating the energy spectrogram); 

deriving at least one further audio feature representing a temporal behavior of 
said extracted predetermined audio feature (Col 7 lines 47-48, syllables per second) by 
parameterizing said power spectrum (Col 7-8 lines 65-2, normalized speech 
modulation energy), wherein parameterizing said power spectrum comprises (a) 
summarizing a frequency axis of the power spectrum by summing energy within at least 
one predetermined frequency band (Col 7 lines 59-60, energy within each of the twenty
channels of equal width represents the sum of energy at each frequency within the
band) and (b) dividing (b)(i) the summed energy within the at least one predetermined 
frequency band by (b)(ii) an average of values of said extracted predetermined audio 
feature (Col 7 lines 65-67, dividing by the frame energy signal) to (c) yield a relative 
modulation depth representing an amount of envelope modulation in the at least one 
predetermined frequency band (Fig 12a, 12b); and 

classifying said audio signal based on said further audio feature (Fig 1, classifier
16).

Scheirer does not specifically mention an average of subsequent values. 

Gray discloses an average of subsequent values (p1, Abstract, a central moving 
average filter divides the present value by preceding and subsequent values). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer to include an average of subsequent values 
by using a central moving average filter as the moving average filter, in order to 
determine the trend at the point of greatest precision, as suggested by Gray (p10). 

Claim 8 is directed to a system for performing the method of claim 1, and so is
rejected for similar reasons. 

Consider claim 9, Scheirer discloses a music system comprising: 
means for playing audio data from a medium (Col 1 lines 38-40); and 
a system as named in claim 8 for classifying audio data (See claim 8).

Consider claim 2, Scheirer discloses at least one predetermined audio feature
that comprises at least one of the following audio features: 
spectral centroid (Fig 3); 
zero-crossing rate (Fig 5); 
spectral roll-off frequency (Fig 6). 

Consider claim 3, Scheirer implies, or at least suggests, at least one
mel-frequency cepstral coefficient (Col 7 lines 55-56).

Consider claim 5, Scheirer discloses: 

calculating an average value of said set of values of said extracted
predetermined audio feature at different time instances (Col 7 lines 63-65);

defining at least one frequency band (Col 7 lines 54-55); 

calculating the amount of energy within said frequency band from said frequency 
analysis (Col 7 lines 60-65); and 

defining said further audio feature as said amount of energy divided by said 
average value (Col 7 lines 60-65). 
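
The four steps recited for claim 5 can be sketched as follows. This is an illustrative sketch only, not Scheirer's implementation: the naive DFT, the function names, the 100 Hz feature rate, and the test signal are all assumptions, and the 3-15 Hz band is borrowed from the bands listed for claim 6. No particular scaling of the power spectrum is implied.

```python
import cmath
import math


def power_spectrum(values):
    """Naive DFT power spectrum |X[k]|^2 for k = 0 .. N//2."""
    n = len(values)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def band_energy_over_average(values, frame_rate_hz, band_hz):
    """Energy summed within band_hz, divided by the average feature value."""
    n = len(values)
    average = sum(values) / n                      # step 1: average value
    low, high = band_hz                            # step 2: frequency band
    spectrum = power_spectrum(values)
    energy = sum(p for k, p in enumerate(spectrum)
                 if low <= k * frame_rate_hz / n <= high)  # step 3: band energy
    return energy / average                        # step 4: energy / average


frame_rate_hz = 100.0  # hypothetical number of feature values per second
# Hypothetical feature trajectory: mean 1.0 with a 4 Hz modulation of depth 0.5.
trajectory = [1.0 + 0.5 * math.cos(2 * math.pi * 4 * t / 100) for t in range(100)]
feature = band_energy_over_average(trajectory, frame_rate_hz, (3.0, 15.0))
flat = band_energy_over_average([1.0] * 100, frame_rate_hz, (3.0, 15.0))
```

With these assumed inputs the modulated trajectory yields a large value while the unmodulated one yields a value near zero, which is the behavior a modulation-depth feature is meant to capture.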

Consider claim 6, Scheirer discloses at least one of the following modulation 
frequency bands are used in said parameterizing said power spectrum: 
1-2Hz
3-15Hz (Col 7 lines 26-50 and lines 59-61)
20-150Hz.

Consider claim 13, Scheirer implies, or at least suggests performing a frequency 
analysis on a set of values of said extracted predetermined audio feature at different
time instances results in a log power spectrum of said extracted predetermined audio 
feature (Fig 7). 

6. Claim 4 is rejected under 35 U.S.C. 103(a) as being unpatentable over Scheirer 
et al. (6,570,991) in view of Gray et al. ("Design of Moving Average Trend Filters using
Fidelity, Smoothness and Minimum Revisions Criteria". Statistical Research Report
Series No. RR96/01, Institute of Statistics and Operations Research, Victoria University
of Wellington, New Zealand, 1997), in further view of Blum et al. (5,918,223).

Consider claim 4, Scheirer and Gray do not specifically mention said 
predetermined audio feature comprises at least one of the psycho-acoustic audio 
features loudness and sharpness. 

Blum discloses audio feature comprises at least one of the psycho-acoustic 
audio features loudness and sharpness (Col 6 lines 45-47, brightness). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer and Gray such that said predetermined 
audio feature comprises sharpness in order to see some of the essential characteristics
of the sounds, as suggested by Blum (Col 6 lines 50-52), making the classification
more accurate. 



7. Claims 7 and 10 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Scheirer et al. (6,570,991) in view of Gray et al. ("Design of Moving Average Trend
Filters using Fidelity, Smoothness and Minimum Revisions Criteria". Statistical
Research Report Series No. RR96/01, Institute of Statistics and Operations Research,
Victoria University of Wellington, New Zealand, 1997), in further view of Rui et al. 
(7,028,325). 



Consider claim 7, Scheirer and Gray do not, but Rui discloses at least one further 
audio feature is defined as at least one coefficient obtained by performing a discrete 
cosine transformation on the result of a frequency analysis (Col 8 lines 33-34). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer and Gray such that at least one further 
audio feature is defined as at least one coefficient obtained by performing a discrete 
cosine transformation on the result of said frequency analysis, in order to calculate the 
MFCCs, as suggested by Rui (Col 8 lines 29-36), which more accurately reflect human 
hearing by having coarser resolution at high frequencies, thereby making them a better 
feature for classification of speech and music. 
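
As a rough illustration of the discrete cosine transformation step, the sketch below applies an unnormalized DCT-II to a short list of log band energies. It is not taken from Rui: the function name and the four band energies are hypothetical, and the mel-scale filter bank that would normally precede this step in an MFCC computation is omitted.

```python
import math


def dct_ii(values):
    """Unnormalized DCT-II: c[k] = sum over n of x[n] * cos(pi * k * (n + 0.5) / N)."""
    n_total = len(values)
    return [sum(x * math.cos(math.pi * k * (n + 0.5) / n_total)
                for n, x in enumerate(values))
            for k in range(n_total)]


# Hypothetical band energies standing in for the result of a frequency analysis.
band_energies = (1.0, 2.0, 4.0, 8.0)
log_band_energies = [math.log(e) for e in band_energies]
coefficients = dct_ii(log_band_energies)
```

The zeroth coefficient is simply the sum of the log energies; the higher coefficients describe progressively finer variation across the bands.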



Consider claim 10, Scheirer and Gray disclose a multi-media system (Scheirer,
Col 1 lines 22-25) comprising:

means for playing audio data from a medium (Scheirer, Col 1 lines 22-25); 

a system as claimed in claim 8 for classifying said audio data (See claim 8). 

Scheirer and Gray do not specifically mention means for displaying video from a 
further medium; means for analyzing said video data; and means for combining the 
results obtained from analyzing said video data with the results obtained from 
classifying said audio data. 

Rui discloses means for displaying video data from a further medium (Fig 2); 
means for analyzing said video data; and means for combining the results obtained 
from analyzing said video data with the results obtained from classifying said audio data 
(Fig 3). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer and Gray to include means for displaying 
video from a further medium; means for analyzing said video data; and means for 
combining the results obtained from analyzing said video data with the results obtained 
from classifying said audio data, in order to allow people to be entertained, as 
suggested by Rui (Col 1 lines 20-23). 

Conclusion 

8. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

9. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jesse Pullias whose telephone number is 
571/270-5135. The examiner can normally be reached on M-F 9:00 AM - 4:30 PM. If 
attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth can be reached on 571/272-7843. The fax phone number 
for the organization where this application or proceeding is assigned is 571/270-6135. 

10. Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. For 
more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 
have questions on access to the Private PAIR system, contact the Electronic Business
Center (EBC) at 866-217-9197 (toll-free).

Primary Examiner, Art Unit 2626 6/25/2010



